Machine learning models rely on various assumptions to attain high accuracy. One of the most basic is that the data are independent and identically distributed (i.i.d.), which implies that the training and test data are sampled from the same distribution. However, this assumption seldom holds in the real world because of distribution shifts. As a result, models that rely on it generalize poorly. In recent years, dedicated efforts have been made to improve the generalization capabilities of these models; the resulting approaches are collectively known as \textit{domain generalization methods}. The primary idea behind these methods is to identify stable features or mechanisms that remain invariant across different distributions. Many generalization approaches employ causal theories to describe invariance, since causality and invariance are inextricably intertwined. However, current surveys treat causality-aware domain generalization methods only at a very high level. Furthermore, we argue that these methods can be categorized by how causality is leveraged in each method and in which part of the model pipeline it is used. To this end, we group causal domain generalization methods into three categories: (i) Invariance via Causal Data Augmentation methods, which are applied during the data pre-processing stage; (ii) Invariance via Causal Representation Learning methods, which are used during the representation learning stage; and (iii) Invariance via Transferring Causal Mechanisms methods, which are applied during the classification stage of the pipeline. This survey also provides in-depth insights into benchmark datasets and code repositories for domain generalization methods. We conclude the survey with insights and discussions on future directions.
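In recent years, unmanned aerial vehicles (UAVs) have gained significant traction in the context of surveillance. However, video datasets that capture violent and non-violent human activity from an aerial viewpoint are scarce. To address this, we propose a novel baseline simulator capable of generating sequences of photorealistic synthetic images of crowds engaged in various activities that can be categorized as violent or non-violent. The crowd groups are annotated with bounding boxes computed automatically using semantic segmentation. Our simulator can generate large, randomized urban environments and maintains an average of 25 frames per second on a mid-range computer with 150 crowd members interacting concurrently. We also show that when real-world data is augmented with synthetic data from the proposed simulator, binary video classification accuracy improves by 5% on average.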